A novel computational method enables RNA editome profiling during human hematopoiesis from scRNA-seq data

RNA editing is a post-transcriptional modification with a cell-specific manner and important biological implications. Although single-cell RNA-seq (scRNA-seq) is an effective method for studying cellular heterogeneity, it is difficult to detect and study RNA editing events from scRNA-seq data because of the low sequencing coverage. To overcome this, we develop a computational method to systematically identify RNA editing sites of cell types from scRNA-seq data. To demonstrate its effectiveness, we apply it to scRNA-seq data of human hematopoietic stem/progenitor cells (HSPCs) with an annotated lineage differentiation relationship according to previous research and study the impacts of RNA editing on hematopoiesis. The dynamic editing patterns reveal the relevance of RNA editing on different HSPCs. For example, four microRNA (miRNA) target sites on 3ʹ UTR of EIF2AK2 are edited across all HSPC populations, which may abolish the miRNA-mediated inhibition of EIF2AK2. Elevated EIF2AK2 may thus activate the integrated stress response (ISR) pathway to initiate global translational attenuation as a protective mechanism to maintain cellular homeostasis during HSPCs’ differentiation. Besides, our findings also indicate that RNA editing plays an essential role in the coordination of lineage commitment and self-renewal of hematopoietic stem cells (HSCs). Taken together, we demonstrate the capacity of scRNA-seq data to exploit RNA editing events of cell types, and find that RNA editing may exert multiple modules of regulation in hematopoietic processes.

is a challenge to detect RNA editing events from scRNA-seq data because it has limited sequencing coverages. To overcome this, firstly we integrated the aligned reads of each cell of the same cell type according to the barcode tag in bam files to increase the sequencing coverages, which were merged to a 'Bulk RNA-seq' data called pseudo-Bulk RNA-seq data (Fig. 1A) (see details in "Methods" section). Secondly, we improved the method for PCR duplicates removal: only the reads which were aligned to the same genomic coordinate from the same cell with the same unique molecular identifier (UMI) were considered as PCR duplicates (Fig. 1B). Third, as the scRNA-seq data is strand-specific, we split the aligned reads and detect the RNA editing sites of genes transcribed from forward and reverse strands, respectively. In addition, according to the RED-ML method and the features of RNA editing, the sites located in the ALU gene element and the sites mutated as A-G are more likely to be RNA editing sites (Fig. 1C). We applied the improved method to the scRNA-seq data annotated into 8 HSPC populations and found that the improved method could retain more available reads (Fig. 1D).
Hematopoietic cell lineage and cell type identification. HSPCs is important for maintaining the integrity of the hematopoietic system. To evaluate our method, we used the scRNA-seq data of human HSPCs with defined cell types and lineage relationships according to previous research ( Fig. 2A,B). To investigate the conversed RNA editing events of human HSPCs, we collected cord blood (CB) and mobilized peripheral blood (mPB) CD34 + cells and obtained their single-cell transcriptomes. Mobilized peripheral blood (mPB) and umbilical cord blood (CB) are two of the current sources of HSPCs. Researches show that CD34 + cells derived from CB have stronger regeneration capacities than that from mPB. These samples included 6 CB samples without in vitro cultivation, 6 CB samples with 48-h in vitro cultivation, 3 mPB samples without in vitro cultivation and 3 mPB samples with 48-h in vitro cultivation (Supplemental Table 1, Supplemental Fig. S1A). We can find the conversed RNA editing sites in these different samples and cell culture states. We used a previous dataset with accession code CNP0000978 42 as reference to identify cell types in our data (Fig. 2B). MLP, MEMP, CMP, GMP, MEP and ProB were annotated in the query datasets based on the expression of marker genes reported in other research (Supplemental Fig. S1B) 42 . Then by integrating the reference and query datasets, we obtained 32,303 cells with specific gene expression profiles (  Table 2). As account for only 0.34% (109 of 32,303) of our data, we removed ProBs in further analysis. After generating the pseudo-Bulk RNA-seq data of these cell types, we reconstructed their lineage relationships using unsupervised hierarchical clustering (Fig. 2E), which is consistent with the classical model of lineage determination in human hematopoietic hierarchy ( Fig. 2A) [42][43][44] . HSC and MPP were found to be clustered together. Myeloid and lymphoid hematopoietic cell populations showed close clustering distinctions, respectively. Taken together, these results demonstrated that our scRNA-seq data exhibited the transcriptome features of human HSPCs. www.nature.com/scientificreports/ Dynamic RNA editing events in hematopoietic cell populations. Next we used RED-ML 37 to detect RNA editing sites using the pseudo-Bulk RNA-seq data of each cell type. To reduce false positives, we only retained RNA editing sites with sequencing depths greater than 30×. Then a total of 17,841,15,367,11,555,45,626,19,325,9,384,3,742 and 27,094 editing sites in the eight hematopoietic cell populations after filtering (Fig. 3A), and 98,549 unique RNA editing sites among eight hematopoietic cell populations. The investigation of the distance between two adjacent editing sites revealed that global RNA editing density was stable across the eight hematopoietic populations (Supplemental Fig. S2A). There was no significant correlation between the total number of non-redundant reads, the number of cells, and the number of identified RNA editing sites, indicating that the differences in RNA editing events across HSPCs were not due to technical issues (Fig. 3B,C). In addition, consistent with previous reports 6 , most of the RNA editing sites (94%) were located in the ALU element (Fig. 3D, Supplemental Fig. S2B). To further characterize the RNA editing sites in HSPCs, we investigated the distribution of RNA editing sites in different functional regions of the genome. Most of the RNA editing sites were located in introns (64%) and 3ʹ UTR (17%), and only 1% of the sites were in the coding region (Fig. 3E, Supplemental  Fig. S2C), which is in agreement with a previous study 14 .
As RNA editing events are dynamic between cell types (Fig. 3A), to build a comprehensive RNA editing landscape of the eight HSPC populations, we collected RNA editing sites whose read coverage was > 30× in all cell types. A fraction of editing sites (termed shared) was edited in all 8 cell types or specifically edited in lymphoid or myeloid or HSC/MPP hematopoietic cells, while another fraction of editing sites (termed specific) was specifically edited in one cell type (Fig. 3F, Supplemental Tables 3, 4). We then investigated their biological functions and found that RNA editing sites with the above patterns were enriched in different pathways (Fig. 3F). RNA editing may involve in embryonic erythropoiesis 45 and adult HSPC differentiation 46,47 . We found the shared RNA editing sites and specific RNA editing sites both enriched in the pathway relevant with regulation of hematopoietic stem/progenitor cells differentiation and cell cycle (Fig. 3F), which has been proved in several previous studies that cell cycle may have influence on HSPCs differentiation 20,48 . Those results suggest that RNA editing may regulate the biological processes of HSPCs in multiple ways. In previous studies, it was found that different RNA editing sites may have co-editing modules 14 . RNA editing sites within the same module are highly correlated. Consistent with these studies, we also identified co-editing modules among different editing sites (Fig. S3A). And different modules may play a role in different cell types (Fig. S3B). The functions of these modules were significantly enriched in the regulation of hematopoietic stem cell differentiation, lineage commitment, stem cell population maintenance, transcription initiation, protein assembly and other HSPC-related biological processes (Fig. 3G, Supplemental Fig. S3C). Therefore, this result suggested that these co-edited modules may have biological functions rather than random events. Together, these observations suggested that RNA editing might play specific roles in different cell types, and might be key to lineage differentiation and self-proliferation during hematopoiesis. Some co-edited modules that occurred in the same biological process may serve as another way for regulating human hematopoiesis.  Table 5). To explore the role of RNA editing sites in hematopoietic processes, GO enrichment analysis was performed on 224 genes. GO enrichment analysis revealed that EIF2AK2, LMO2, N4BP2L2, and RUNX1 were involved in regulating the differentiation of hematopoietic stem/progenitor cells and the RNA editing sites of those genes located in 3ʹ UTR and exon which may influence the miRNA binding sites and protein translation. (Fig. 4B, Supplemental Table 6). Among them, EIF2AK2 was reported to regulate responses to stress during hematopoiesis 49 . In our data we identified 37,103,288, 37,103,292, 37,103,360, 37,103,378 four RNA-editing sites in EIF2AK2 3ʹ UTR, which may affect miRNA binding (Fig. 4B,C). Prediction of miRNA targeting sites using TargetScanHuman 50 revealed that the binding sites of miR-23a/miR-23b in EIF2AK2 3ʹ UTR were altered due to RNA editing (Fig. 4D, Supplemental Fig. S4A). As miR-23a/miR-23b were expressed in CD34 + cells 51 , we hypothesized that RNA editing may prevent the binding of miR-23a/miR-23b to EIF2AK2, thereby derepress EIF2AK2. Then we tested this hypothesis and found that as expected, for cells expressing EIF2AK2, the overall expression levels of EIF2AK2 were higher in the cells when EIF2AK2 was edited (log 2 foldchange: 0.56; P-value < 0.0001) (Fig. 4E). And we found that EIF2AK2 editing occurred only in a small fraction of cells (3.11%) (Supplemental Fig. S4B), which was in agreement with a previous study 52 .
Cell pool integrity is maintained in the blood system by clearing damaged cells. But in order to ensure the longevity of the cells 53,54 , it is also necessary to ensure that the cells can survive the lower-level stress that often occurs 49 . Many stressors congregate on the integrated stress response (ISR) pathway, whose function is to balance the pressure signals that activate the cell death pathway in order to protect the cells to restore the cell balance 55 . PKR encoded by EIF2AK2 is one of the four stress-inducible kinases that maintain the survivaldeath equilibrium 49 . The interaction between PKR and the other three kinases (HRI, PERK, GCN2) affects the phosphorylation of eIF2α 56 , leading to the attenuation of global translation initiation, which saves amino acid synthesis for essential cellular function, reduces the load of chaperones, and lowers the metabolic demand associated with protein synthesis 57 . Along with the up-regulation of EIF2AK2, We found the expression levels of EIF2AK1, EIF2AK3 and EIF2AK4 (encoding HRI, PERK and GCN2, respectively) were all higher in cells expressing edited EIF2AK2 compared to those expressing unedited EIF2AK2 (Supplemental Fig. S4C), implying that RNA editing of EIF2AK2 may affect the maintenance of cell homeostasis. In addition, we compared the expression levels of EIF2AK2 in eight cell types carrying EIF2AK2 edited sites with non-edited cells. We found that the expression levels of EIF2AK2 in cells with edited sites were higher than those in cells without edited except GMP with no significant difference (Supplemental Fig. S5A). Meanwhile, previous studies found that human  www.nature.com/scientificreports/ HSCs are sensitive to the interference of cell homeostasis. The induction of reactive oxygen species (ROS) and accumulation of DNA damage increase the possibility of apoptosis of HSCs compared to downstream progenitor cells 58,59 . A study has found high integrated stress response activity in HSC/MPPs compared to progenitors which have lower differentiation potential 49 . Therefore, we hypothesized that there is a relationship between the activity of ISR pathway and differentiation potency. To test this hypothesis, we used SCENT 60 to explore the effect of RNA editing on the differentiation potential of cells. The result also revealed a higher differentiation potential in EIF2AK2 edited cells (Fig. 4F). High differentiation potential was also shown in cells edited with EIF2AK2 among the 8 HSPC populations with significant differences (Supplemental Fig. S5B). Taken together, based on the above analysis, the RNA editing events in EIF2AK2 may play an important role in maintaining cell homeostasis. Besides, we speculated that EIF2AK2 may undergo RNA editing in cells with higher differentiation potential which may more sensitive to interference.

Mononuclear cell differentiation
To further explore the function of genes involved in myeloid and lymphoid HSPCs differentiation pathways, we focused on RNA editing sites located in the 3ʹ UTR and coding regions which may influence the post-translational modification and the function of proteins. We found that 19 and 32 genes were in the pathways regulating lymphoid HSPCs differentiation and myeloid HSPCs differentiation, respectively. And the RNA editing sites of those genes all occurred in the 3ʹ UTR and exon coding regions (Fig. 5D). Many of these genes play important functions in the hematopoietic system. For example, Caspase 8 encoded by CASP8 is a cysteine protease and a key mediator of apoptosis 61 . Recently, however, many studies have introduced some new aspects to it, citing their significance in cell development and differentiation. The study by Rebe et al. has proved that caspase 8 activity is required for the differentiation of peripheral blood monocytes into macrophages in myeloid cell lines 62 . Meanwhile, ZBTB1 regulates the development of lymphoid cell lines and myeloid cell lines. Siggs, Owen et al. have shown that a chemically induced mouse with Zbtb1 mutation has a complete and cell-intrinsic T cell deficiency. Besides this, other lymphoid cell lines were also partially impaired. The study shows that ZBTB1 may act as an important transcriptional regulator determining T cell development and lymphopoiesis 63 .
HSCs are the stem cells that give rise to other hematopoietic cells. We find 6169 sites distributed on 2424 genes specifically edited in HSCs (Fig. 5E). GO Enrichment analysis revealed that genes with RNA editing sites were involved in hematopoietic stem cell differentiation and hematopoietic progenitor cell differentiation, and the RNA editing sites of some genes are located in 3ʹ UTR or exon coding region (Fig. 5F, Supplemental Table 6), such as EIF2AK2, N4BP2L2. According to the GenCards database (https:// www. genec ards. org/), N4BP2L2 may be involved in the positive regulation of HSC proliferation and negative regulation of HSC differentiation (Table 1).
Our results showed that RNA editing occurred in genes important for regulating lineage commitment, cell differentiation, and self-renewal during hematopoiesis, suggesting an important role for RNA editing in the hematopoietic process.

Discussion
Differing from previous analysis of scRNA-seq mainly focusing on the gene expression levels, here we provided a novel computational method to study RNA editing using scRNA-seq, and applied it to interpreting hematopoietic differentiation and HSC self-renewal. To perform RNA editing analysis in scRNA-seq data, we made the following three improvements: (i) using cell annotation information to obtain pseudo-Bulk RNA-seq of different cell types; (ii) optimizing the marking duplicates method to increase the sequencing depth; and (iii) using strand information to detect RNA editing sites. Compared with other studies, the distribution of RNA editing sites across different genomic elements, and the proportion of RNA editing sites within ALU elements were consistent with previous research reports 14 . We then explore the shared RNA editing sites and specific RNA editing sites among different HSPC populations and found that RNA editing is dynamic between different cell types during HSPCs differentiation and selfrenewal, and has specific editing events for different cell types. However, our study did not further resolve the underlying mechanisms conferring to this dynamic change. It seems that RNA editing may serve as a molecular marker for different cell types and have implications for the performance of specific functions of different cell types. Since ADAR enzymes catalyze RNA editing events, there may be some differences in the expression of ADAR enzymes in different cell types, all of which need to be investigated in detail by additional work. Moreover, our study found that EIF2AK2 with the shared RNA editing sites in all HSPC cell populations may have an impact on maintaining self-homeostasis during HSPCs differentiation through ISR pathway. Though a study has found that ISR activation in HSCs/MPPs is more active than in hematopoietic progenitor cells 49 , our findings may indicate that the ISR pathway not only safeguards HSCs/MPPs but also affects all HSPCs with high differentiation potential. This implies that HSPCs with higher differentiation potential may be more sensitive to external stimulus stress. They perform specific regulation of their own cellular homeostasis through RNA editing to maintain the integrity of the hematopoietic system. Meanwhile, we found lymphoid-specific and myeloid-specific RNA editing sites may involve in lineage commitment and HSPCs differentiation. According to a previous study, RNA editing may influence lineage commitment during hematopoiesis. They found the frequency of RNA editing alters at the branch point of HSPCs differentiation 14 . Our results indicate that not only changes in editing frequency, but also specific RNA editing sites occur in different lineages during differentiated development. This may suggest that RNA editing sites occur at genes critical to lineage commitment. Besides shared sites, we found specific editing sites in HSCs may have functions on differentiation and self-renew. The capacity to self-renew and to differentiate into other hematopoietic cells are important features of HSCs 15 . This also suggests that RNA editing www.nature.com/scientificreports/ may play an important role in maintaining the balance of self-renewal and differentiation of HSCs. However, those results still require more experiments to verify the role of RNA editing for HSPCs. More broadly, our approach validates the feasibility and usability of RNA editing event studies using highthroughput scRNA-seq. And we showed the overall differences RNA editing of the HSPCs and the possible function. In summary, these efforts confirm the great potential and value of scRNA-seq for the study of biological process mechanisms. Though future studies will be required to confirm and clarify the role of RNA editing.

Methods
Accession numbers. The reference dataset (aligned reads) was downloaded from CNSA (https:// db. cngb. org/ cnsa/) of CNGBdb with accession code CNP0000978. Both the reference and query dataset were CD34 + cells obtained from human CB and mPB, and the method of cell cultivation, scRNA-seq libraries and pre-processing were as previously described 42 . The raw sequencing data of query dataset generated in this study was deposited in the CNGB Sequence Archive (CNSA; https:// db. cngb. org/ cnsa/) of China National GeneBank DataBase (CNG-Bdb) with accession code CNP0003367.
Ethics statement. This study was performed with the approval of the Institutional Review Board of BGI (BGI-IRB 16089-T3 and BGI-IRB 22090). All methods were performed in accordance with relevant quidelines and regulations.
Enrichment of CD34 + cells from human CB and mPB samples. Human CB and mPB samples were obtained from healthy donors with informed consent. We got mononuclear cells (MNCs) using centrifugation on Lymphoprep medium. CD34 Microbead kits and LS columns using MACS magnet technology (Miltenyi) were used for MNC enrichment for CD34 + cell selection. Downstream experiments were conducted on CD34 + cells after sorting.
Cell culture in vitro and scRNA-seq library. The fresh CD34 + cells were applied to cell culture in vitro or to single-cell RNA-seq (scRNA-seq). CD34 + cells were resuspended in SCGM medium (Cellgenix) using the following recombinant hematopoietic cytokines: recombinant human stem cell factor (rhSCF) 100 ng/ml, recombinant human thrombopoietin (rhTPO) 100 ng/ml, recombinant human fms-related tyrosine kinase-3 ligand (rhFlt3-L) 100 ng/ml and cultured in 24-well tissue culture plates at 37 °C in an atmosphere of 5% CO 2 for 48 h (Thermo Fisher). The DNBelab C4 platform was used to perform scRNA-seq. Single-cell suspensions were used for droplet generation, demulsification, microbead collection, reverse transcription, and cDNA amplification to generate barcode libraries. The manufacturer's protocol was used to construct indexed libraries. Qubi-tTM ssDNA Assay Kit (Thermo Fisher Scientific; #Q10212) was used to quantify the sequencing libraries. DNA nanoballs (DNBs) are loaded into a pattern nanoarray and sequenced at ultra-high throughput with the following read lengths used by the DIPSEQ T1 sequencer: 30 bp for read 1, inclusive of 10 bp cell barcode 1, 10 bp cell barcode 2 and 10 bp unique molecular identifier (UMI), 100 bp of transcript sequence for read 2 and 10 bp for sample index.
Mapping and annotating query datasets. To identify HSPC populations in CD34 + cells, we used "Mapping and annotating query datasets" method of Seurat package (v. 4.1.0) in R (v.4.0.5) [64][65][66][67] . This approach allows the comparison of the similarity of cells in the reference and target datasets, resulting in cell annotation of the target dataset with unknown cell types. The reference data set was obtained from CNGBdb with accession code CNP0000978 42 . Then the query dataset was mapped to the reference data for annotating cell types. First, we imported the final cell-gene expression matrix of the query dataset into the Seurat package to create a Seurat objects. Cells with fewer than 200 detected genes and for which the total mitochondrial gene expression exceeded 5% were removed. Besides, we used the IQR Method of Outlier Detection to remove cells with outlier gene number. Genes expressed in fewer than three cells were also removed.
Downstream analyses were also performed using Seurat package (v. 4.1.0). Normalizing SeuratObject used the NormalizeData function and the ScaleData function. The following functions were run together when mapping the query dataset to the reference dataset: FindIntegrationAnchors, TransferData, IntegrateEmbeddings, Projec-tUMAP. The FindAllMarkers function was then used to find the cell type-specific genes. Differential expression analysis was performed based on the Wilcoxon Rank Sum test. Finally, the merge function of Seurat was used to merge the query dataset and reference dataset. All methods use default parameters unless otherwise specified.
The AverageExpression function of Seurat was used to calculate the average expression of 8 HSPC cell types. And then we used hclust function to perform the hierarchical clustering.
Splitting aligned reads based on cell type. To get pseudo-Bulk RNA-seq of 8 cell types, we got the barcode and sample information for each cell of the 8 cell types from SeuratObject. Based on the CB (Cell Barcode) tag in the bam file we can determine the cell from which the reads originated, so we can use the cell barcode of each cell type to obtain aligned reads for each cell barcode. And then, we integrated multiple cells aligned reads of each cell type to obtain a pseudo-Bulk RNA-seq for each cell type. The code of this method is available on Github (https:// github. com/ Genki-YAN/ Cell2 Editi ng).
Marking duplicates and obtaining strand-specific reads. Considering the difference between scRNA-seq data and Bulk RNA-seq data, we designed the procedure of marking duplicates: taking the aligned position, UMI barcode, and cell barcode of reads into account for the definition of duplicates. Only the reads www.nature.com/scientificreports/ from the same cell with the same unique molecular identifier (UMI) were considered as PCR duplicates. And we retained the read with the highest base quality. The code of this process is available on Github (https:// github. com/ Genki-YAN/ Cell2 Editi ng). We split strand-specific aligned reads based on flag 16 (read reverse chain) in bam file using Samtools 68 with the following commands: samtools view -b -f 16; samtools view -b -F 16.
We compared the number of available reads detected by the improved method with the Picard ("Picard Toolkit." 2019. Broad Institute, GitHub Repository. https:// broad insti tute. github. io/ picard/; Broad Institute) method. We counted and compared the non-duplicates reads in the bam files processed by the two methods. The tool locates and marks duplicates in BAM or SAM files, where repeat reads are defined as originating from a single DNA fragment. Duplication may occur during sample preparation, such as library construction using PCR. If the 5' position, strand and base alignment are the same, Picard compared base alignment quality to mark duplicate reads and available reads.
Detecting RNA editing sites. After the above pre-process of data, we used the RED-ML (https:// github. com/ BGIHE NG/ RED-ML) 37 with the default parameters to detect the RNA editing sites in the pseudo-Bulk RNA-seq using the GRCh38 genome and GRCh38 SNP database (https:// ftp. ncbi. nih. gov/ snp/ organ isms/ human_ 9606_ b151_ GRCh3 8p7/ VCF/ All_ 20180 418. vcf. gz), and removed RNA editing sites with coverage less than 30× to facilitate subsequent analysis. When we investigated cell type-specific RNA editing sites, we used the mpileup function of Samtools for investigating site sequencing coverage, keeping sites with coverage greater than 30× with the following commands: samtools mpileup -B -f -s -output-QNAME -min-MQ 20 -min-BQ 20 -excl-flags DUP.
Annotation of RNA editing sites. RNA editing sites were annotated utilizing ANNOVAR (https:// annov ar. openb ioinf ormat ics. org/ en/ latest/) 69 table_annovar.pl with the reference genome of GRCh38. We used the following commands: -remove -protocol refGene, phastConsElements20way, wgRna, cytoBand -operation g,r,r,r -nastring . -csvout -polish. Based on the strand of the aligned reads, we only retain the RNA editing sites with genes transcribed from the same strand.
Correlation module analysis of RNA editing sites. To identify the RNA editing sites that were coedited in different cell types, the co-editing modules of filtered RNA editing sites in 8 cell types were analyzed using WGCNA package (v. 1.70.3) 70,71 . The WGCNAR package was used to estimate the best soft thresholding power for the co-editing module analysis. The minimum power 14, which reached the R2 cut-off of 0.8 for topology model fit, was determined to be the optimal value. The adjacency with the optimal soft-thresholding power estimated above was calculated, and the adjacency was transformed into a topological overlap matrix to calculate the corresponding dissimilarity, and identified 44 co-editing modules with minModuleSize = 30. Then we annotated the RNA editing sites in each module and perform GO enrichment analysis using clusterProfiler R package (v.4.0.5) 72,73 .

Identification of cells with editing sites.
Since RED-ML is a method designed for RNA editing sites by Bulk RNA-seq, and there is no existing method to detect RNA editing sites in individual cells, we used the mpileup function of Samtools to find the aligned reads with RNA editing sites. The cells with at least 1 read supporting RNA editing events were defined as edited cells. Samtools mpileup was performed using the following parameters: samtools mpileup -B -f -s -output-QNAME -min-MQ 20 -min-BQ 20 -excl-flags DUP. To calculate the proportion of cells with edited EIF2AK2, we removed cells that did not express EIF2AK2.
Gene expression and functional analysis of RNA editing sites. The expression of genes with RNA editing sites in cells was quantified using ScaleData function of Seurat. To explore the function of editing sites with different editing patterns, we performed GO enrichment analysis using clusterProfiler package (v.4.0.5) 72,73 . Prediction of miRNA target sites. TargetScanHuman (https:// www. targe tscan. org/ vert_ 80/) 50 was used to predict miRNAs that may target and bind to candidate genes. Then we compared the binding region with the RNA editing sites to determine whether RNA editing sites are located in the miRNA binding region. The reference file of UTR sequence was downloaded from TargetScanHuman website. The version of RNA sequence used was GRCh38.
Estimating differentiation potency of single cells. To demonstrate that RNA editing may be correlated with differentiation potency, we first removed cells that do not express genes with the RNA editing, and then we calculated the differentiation potency of cells using SCENT (https:// github. com/ aet21/ SCENT) 60 with the default parameters. The P-value was calculated using T-Test.

Data availability
The datasets generated and analysed during the current study are not publicly available due to human genetic resources management and the data in a published article entitled "Stemness-related genes revealed by single-cell profiling of naïve and stimulated human CD34+ cells from CB and mPB" (https:// doi. org/ 10. 1002/ ctm2. 1175) being controlled but are available from the corresponding author on reasonable request.